Duration - 2 hours 30 minutes
We’re coming into the final stretch of ggplot2 basics. We can now produce informative plots, but we also want our plots to be clear, and just to look good.
We’ll look first at customising colour schemes, which ties in to our previous lesson on scale. We’ll also touch on shape scales. After colour, we’ll look at the related topic of legends. Lastly, we’ll look at themes which covers the very basic design elements of your plot - the fontface, the size of elements, the background, the grid lines, the axis lines, etc.
As we discussed in the first lesson, scales give the details of the mapping from the data to the aesthetic qualities, e.g. if the aesthetic mapping is aes(colour = sex) then it’s the scale that defines which colours are assigned to which sex:
| Sex | Colour |
|---|---|
| Male | Green |
| Female | Pink |
or if the aesthetic mapping is aes(colour = age) then it is likely a spectrum of colours corresponding to age:
| Age | Colour |
|---|---|
| 1 | Light blue (high luminance) |
| 12 | Dark blue (low luminance) |
| 1-12 | Intermediate shade of blue in proportion to proximity of age to 1 and to 12. |
Again, ggplot2 will detect the data type (discrete or continuous), create default scales for you, but sometimes you want to define your own scale, e.g.
| Sex | Colour |
|---|---|
| Male | Blue |
| Female | Red |
| Age | Colour |
|---|---|
| 1 | Blue |
| 10 | White |
| 12 | Red |
| 1-10 | Blend of blue and white in proportion to proximity to 1 (blue) and 10 (white) |
| 10-12 | Blend of white and red in proportion to proximity to 10 (white) and 12 (red) |
R uses hexadecimal to represent colours. Hexadecimal is a base-16 number system used to describe colour.
Red, green, and blue are each represented by two characters (#rrggbb). Each character has 16 possible symbols: 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F.
e.g. red= #FF0000 , black=#000000, white = #FFFFFF
Two additional characters (with the same scale) can be added to the end to describe transparency (#rrggbbaa).
R has a wide variety of predefined colour names available which can be used in place of a hex-code. You can see a list of names easily by typing colours().
The hcl function generates hexidecimal colours, from HCL numbers.
hcl(200, 50, 50)
## [1] "#00878F"
We’ll look at three main types of colour gradients:
We’ll also consider:
which allows you to use apply a set of predefined scales.
This is the most basic colour gradient. There’s two functions that let us do this: one applies to colour aesthetics, one applies to fill aesthetics.
scale_colour_gradientscale_fill_gradientWe’ve touched on these functions briefly in the first lesson yesterday. Now we can extend our use of them by controlling how they scale the colours. To do this, we need to give the high and low colour for the gradient in each function. As noted, we can use RGB codes, or we can just use named predefined colours, e.g.
library(ggplot2)
library(CodeClanData)
ggplot(pets, aes(weight, age, colour = sleep)) +
geom_point() +
scale_colour_gradient(low = "gray0", high = "gray100")
This gives us a grey scale plot.
Task - 5 mins
Look up the help documentation for the functions above (or consult the ggplot cheatsheet). Can you change the colour scale to something else?
Example Answer
ggplot(pets, aes(weight, age, colour = sleep)) + geom_point() + scale_colour_gradient(low = "light blue", high = "dark blue")![]()
If two colours is good then three must be better? In this case, yes. Using three colours gives us greater control over the output. The classic application is a divergent scale like we showed at the start. Essentially we construct 2 two-colour gradients, and define a midpoint value where the two gradients meet. The functions used are:
scale_colour_gradient2scale_fill_gradient2We’ve seen this applied above, but we repeat it here.
ggplot(pets, aes(weight, age, colour = sleep)) +
geom_point() +
scale_colour_gradient2(midpoint = 15, low = "blue", high = "red", mid = "white")
Task - 5 mins
Look at the plot above. What does it tell you? Write your interpretation of it down. Do you think this colour scale is the best for it? If no, what would you change the colour scale to?
If three is better, then n > 3 must be better? Not this time.
Typically we use n-colour gradients only when the colours are meaningful to our data in some way, e.g. standard terrain colours on maps, or if you’d like to use a palette produced by another package, e.g. colorspace.
scale_colour_gradientnscale_fill_gradientne.g. Maunga Whau (Mt Eden) is one of about 50 volcanos in the Auckland volcanic field. This data set gives topographic information for Maunga Whau on a 10m by 10m grid. Rather than write a list of colours ourself we are taking one from the library colorspace.
ggplot(volcano) +
geom_raster(aes(x, y, fill = height), interpolate = TRUE) +
scale_fill_gradientn(colours = colorspace::terrain_hcl(5))
Task - 5 mins
Look at the plot above. What does it tell you? Write your interpretation of it down.
What happens if you change the number of colours you are sampling from the
terrain_hclcolour pallette. Do you think this colour scale is the best for it? What happens if you take more or fewer colours from it (e.g. changeterrain_hcl(5)toterrain_hcl(10)and see what happens). What options would you choose for this plot and why?
Colour brewer is a popular set of colour schemes. You can see the avalible schemes at their website: http://colorbrewer2.org/
ggplot2 includes functions that can automatically select these palettes by name.
scale_colour_distillerscale_fill_distillerLet’s plot our volcano data again, this time using scale_fill_distiller instead.
ggplot(volcano) +
geom_raster(aes(x, y, fill = height), interpolate = T) +
scale_fill_distiller(palette = "RdPu")
As you can see, it chooses the colours based on whatever pallette you’ve chosen.
Task - 5 mins
Below is a heatmap that shows the maximum tempreature in Scotland for each month of year from 1910 to 2015.
ggplot(temp_df) + geom_raster(aes(x = month, y = year, fill = max_temp))
Change the colour scheme to
- A different two colour gradient
- A three colour gradient
- A n colour gradient
- One of colour brewer’s gradients
Which colouring do you think is best?
Answer
plot <- ggplot(temp_df) + geom_raster(aes(x = month, y = year, fill = max_temp))
plot + scale_fill_gradient(high = "red", low = "blue")
plot + scale_fill_gradient2(high = "red", mid = "white", low = "blue", midpoint = mean(temp_df$max_temp))
plot + scale_fill_gradientn(colours = colorspace::diverge_hcl(7)) # using a different pallet from colour space
plot + scale_fill_distiller(palette = "RdYlGn")
Number 2 is probably the clearest, and relates to the data by setting the midpoint as the mean of the data.
The functions we looked at above work with continous data points. There are also four discrete colour scales.
scale_fill_hue()scale_colour_hue()Picks evenly spaced hues around the HCL colour scale. (You’ll learn more about the HCL scale in the next lesson!)
ggplot(pets) +
aes(x = animal, fill = interaction(sex, animal)) +
geom_bar() +
scale_fill_hue()
We can choose different ranges to map over.
ggplot(pets) +
aes(x = animal, fill = interaction(sex, animal)) +
geom_bar() +
scale_fill_hue(h = c(120, 300), c = 40, l = 45)
The second option:
scale_colour_brewer(palette)scale_fill_brewer(palette)Uses the named colourbrewer2 palettes.
ggplot(pets) +
aes(x = animal, fill = interaction(sex, animal)) +
geom_bar() +
scale_fill_brewer(name = "Sex and Animal Type", palette = "Set1")
The third option:
scale_colour_grey(start = 0, end = 1)scale_fill_grey(start = 0, end = 1)uses shades of grey (from black = 0 to white = 1). You can use start and end to restrict the range of greys to be included.
ggplot(pets) +
aes(x = animal, fill = interaction(sex, animal)) +
geom_bar() +
scale_fill_grey()
ggplot(pets) +
aes(x = animal, fill = interaction(sex, animal)) +
geom_bar() +
scale_fill_grey(start = 0, end = 0.5)
ggplot(pets) +
aes(x = animal, fill = interaction(sex, animal)) +
geom_bar() +
scale_fill_grey(start = 0.5, end = 1)
The final option:
scale_colour_manual(values=c())scale_fill_manual(values=c())allows users to define their own colour palettes
ggplot(pets) +
aes(x = animal, fill = interaction(sex, animal)) +
geom_bar() +
scale_fill_manual(
values = c(
"F.Cat" = "red",
"M.Cat" = "blue",
"F.Dog" = "green",
"M.Dog" = "yellow"
)
)
or use premade colour palettes from outside ggplot2
library(wesanderson)
ggplot(pets) +
aes(x = animal, fill = interaction(sex, animal)) +
geom_bar() +
scale_fill_manual(
values = wes_palette("GrandBudapest1")
)
Task - 5 mins
The plot below shows the changing make-up of the average Chinese-person’s diet.
ggplot(chinesemeal) + aes(x = Year, colour = FoodType, y = CaloriesPerDay) + geom_line()
Can you change the colours in this plot so that.
- They are the same as the default, but with lower luminencece.
- They use a colour brewer scheme
- They use a grey colour scheme
- They are set to some manual value chosen by you.
Answer
chinese_meals_plot <- ggplot(chinesemeal) + aes(x = Year, colour = FoodType, y = CaloriesPerDay) + geom_line()
chinese_meals_plot + scale_colour_hue(l = 45)
chinese_meals_plot + scale_colour_brewer(palette = "Set3")
chinese_meals_plot + scale_colour_grey()
chinese_meals_plot + scale_colour_manual(values = wes_palette("Moonrise3"))
Legends show discrete colours/shapes/sizes, whereas colour bars display a continuum.
ggplot2 generates these automatically based on the scale but we can make minor adjustments. Let’s start look back at our plot from above:
chinese_meals_plot
We can override the default guides using the guide argument in the scale function, and the following guide functions:
guide_legend()guide_colourbar()The following arguments are available:
nrow or ncol allows us to specify the layout of the key elements as a grid; byrow used in combination determines whether key elements are ordered by row (byrow = TRUE) or by column (byrow = FALSE)reverse reverse the order in which the key elements are displayedkeywidth and keyheight allows us to alter the size of the keys (that is, the lines or symbols beside the legend labels)Let’s change the position of the legend:
chinese_meals_plot +
scale_colour_hue(guide = guide_legend(nrow = 2, byrow = TRUE))
Now let’s change the key defaults:
chinese_meals_plot +
scale_colour_hue(guide = guide_legend(nrow = 2, byrow = TRUE, keywidth = 1))
Task - 2 mins
Have a play around with the
keywidthandkeyheightarguments to see what it does to the legend.
Themes are concerned with the purely stylistic. e.g. font size, title placement, legend placement, grid lines, axis lines, background colour, etc.
This doesn’t mean they’re unimportant - clarity and visual attractiveness are vital - but if we’re using themes we’re not doing this for our own benefit - we’re doing this for the audience.
We’ve used these periodically throughout the course not explaining what they do. You may have worked out what they’re doing to some extent.
We’ll look at the standard themes available quickly, then we’ll look at the details of creating your own themes. The latter can be applied to adjust the standard themes.
The standard themes are:
theme_grey() the default theme - the grey background with white grid lines was designed to put the data forward whilst facilitating comparisons (by minimising the visual impact of grid lines)theme_bw() a minor variation on the default that uses a white background and thin grey lines - personally I prefer this one to the defaulttheme_linedraw() white background and black lines (like a line drawing apparently)theme_light() a minor variation of theme_linedraw that uses light grey linestheme_dark() a variation on theme_linedraw that uses a dark background (useful for making thin coloured lines ‘pop out’)theme_minimal() a mininmalistic theme with no background annotationstheme_classic() reminiscent of base r with no grid linestheme_void() a completely empty themeAll of these themes have a base_size parameter which controls font size. The themes automatically adjust font sizes for different annotations (e.g. title x1.2, tick labels x0.8). If you want more precise control of the font sizes you’ll need to adjust the individual elements as discussed in the next section.
It’s highly advisable to give consideration to base_size - ggplot2 generally produces good plots in the default but font size can be too small for some applications.
Task - 2 mins
Try and apply different themes to your
chinese_meals_plot.
Example Answers
chinese_meals_plot + theme_classic()
chinese_meals_plot + theme_minimal()
chinese_meals_plot + theme_light()![]()
Themes are made up of theme elements. There are vast number of theme elements, but most can be set using four functions:
element_textelement_lineelement_rectelement_blankLet’s go through each and look at some examples of each.
element_text formats text, such as titles and axes labels. Basic adjustments to size, face and colour. Font face options include c("plain","bold","italic","bold.italic").
Let’s add a title to our chinese_meals_plot and then format it.
chinese_meals_plot +
labs(title = "Typical diet of a Chinese Citizen") +
theme(title = element_text(size = 20, face = "bold", colour = "red"))
Wow, that’s bold. You can find more details in vignette(ggplot2-specs). Setting the font face can be frustrating without the names to refer to.
Task - 2 mins
Try and format your text to something more suitable.
element_line() draws lines by colour, size and linetype. For example, grid lines or axis lines.
Let’s add some line formatting to our plot.
chinese_meals_plot +
theme(
panel.grid.major = element_line(
colour = "black",
size = 2,
linetype = "solid"
),
panel.grid.minor = element_line(
colour = "red",
size = 1,
linetype = "dotted"
)
)
Task - 2 mins
Wow, the plot above is now extremely terrible. Try and adjust the line formatting to something reasonable instead.
element_rect draws rectangles, mostly used for background fill and border colour.
chinese_meals_plot +
theme(plot.background = element_rect(fill = "grey76", colour = NA))
chinese_meals_plot +
theme(plot.background = element_rect(fill = "black", colour = "yellow"))
chinese_meals_plot +
theme(plot.background = element_rect(fill = "black", colour = "yellow", size = 4))
element_blank draws nothing. It is used to override the default when you don’t want an element in your graph. For example, imagine you had a plot which you wanted to remove the title from. You can do:
# save the plot with a title
chinese_meals_plot2 <- chinese_meals_plot +
labs(title = "Typical diet of a Chinese Citizen")
chinese_meals_plot2
# remove title
chinese_meals_plot2 +
theme(plot.title = element_blank())
Setting up a theme in ggplot2 can be a lengthy and tedious grind, working through all of these elements. However:
ggthemes.It’s good practice to go through all of this, but in practice you (hopefully) won’t use it all too many times!
Task - 15 mins
Here is a basic plot. The client prefers dark backgrounds, but otherwise you’re free to display it as you like.
Make it look as good as you can.
ggplot(scottish_exports) + geom_line(aes(x = year, y = exports, colour = sector)) + facet_wrap(~sector, scales = 'free_y')
Answer
# An example of an answer library(dplyr) scottish_exports$sector <- as.character(scottish_exports$sector) scottish_exports$sector <- case_when( scottish_exports$sector == "Mining_Quarrying" ~ "Mining/Quarrying", scottish_exports$sector == "Agriculture_Forestry_Fishing" ~ "Agriculture/Forestry/Fishing", TRUE ~ scottish_exports$sector )ggplot(scottish_exports) + geom_line(aes(x = year, y = exports, colour = sector), size = 1) + facet_wrap( ~sector, scales = 'free_y' ) + labs( title = "Scottish Exports through Time", x = "\nYear", y = "Export Volume\n" ) + scale_colour_brewer(guide = FALSE, palette = "Set2") + theme_minimal(12) + theme( panel.background = element_rect(fill = "#0f1966"), panel.grid = element_blank(), plot.title = element_text(size = 18) )
Answer
Answer
Continuous:
scale_colour_gradient - two colour gradientscale_colour_gradient2 - three colour gradient (such as diverging scales)scale_colour_gradientn - n colour gradient (typically imported palettes)scale_colour_distiller - colourbrewerDiscrete:
scale_colour_hue - equally spaced HCL coloursscale_colour_brewer - colourbrewerscale_colour_grey - shades of greyscale_colour_manual - assign colours to values by hand, or import palette, e.g. wesandersonAnswer
We can change these using the theme() function, or the various predefined theme functions such as theme_bw() or theme_dark().
Answer
A few of these are conventional vectors or unit() data types, but the majority are setup using the following types:
geom_text : font face, size, etc. of textgeom_rect : rectangle fill and outlinegeom_line : line colour, style, size, etc.geom_blank : eliminates an elementLinks of where else to look